ABSTRACT

Military  forces  of  the  future  will  use  mixed  manned  and 
unmanned  forces  for  a  broad  variety  of  functions. 
Measurement  of  overall  effectiveness  in  these  mixed  initiative 
systems  will  be  essential  in  order  to  achieve  optimal  system 
performance  levels.  Behavioral  measures  of  both  human  and 
unmanned  performance  obtained  in  system  simulations  or  in 
live exercises will be used to continuously diagnose
performance and identify areas where training is
required. Likewise, specialized training will be necessary
in  order  to  leverage  the  complementary  cognitive  functions  of 
human  and  machine  to  forge  fighting  entities  and  units  with 
capabilities  superior  to  those  of  humans  or  machines  in 
isolation.  Our  team  is  currently  developing  a  Mixed  Initiative 
Team  Performance  Assessment  System  (MITPAS)  consisting 
of  a  methodology,  tools  and  procedures  to  measure  the 
performance  of  mixed  manned  and  unmanned  teams  in  both 
training  and  real  world  operational  environments.  The  work  is 
being  performed  under  SBIR  Phase  I  and  II  contracts 
administered  by  RDECOM/STTC,  Orlando,  FL.  Our 
objective  is  to  provide  a  scalable  turnkey  MITPAS  software 
system  integrated  with  simulation  and  training  environments, 
utilizing  COTS  HLA  data  logging  tools  and  containing 
protocols  for  evaluation  of  various  manned/unmanned  team 
configurations  in  selected  event-based  scenarios.  This  paper 
describes our in-progress development of an underlying
Multi-Dimensional Performance Model, our preliminary MITPAS
architecture  and  our  Use  Case  Scenario  based  experimental 
and  evaluation  plan,  as  well  as  our  ideas  for  future 
applications  of  the  completed  MITPAS. 

1. INTRODUCTION

Mixed  initiative  introduces  a  new  and  unique  aspect  to  the 
psychology  of  team  performance:  the  interaction  of  two 
cognitive  systems  —  human  and  autonomous  unmanned  robot. 
In addition to the critical performance factors associated with
human teams (information exchange, communication, supporting
behavior and team leadership), the mixed manned/unmanned team
adds a number of challenging new dimensions. Foremost among
these is the ability of the human team to manage, predict the
behavior of, collaborate with and develop trust in unmanned
systems that may sometimes exhibit fuzzy responses in
unstructured and unpredictable environments [1] [2] [3] [4] [8] [9].

The  critical  challenge  in  our  work  has  been  to  develop  system- 
specific  measures  of  behavior  on  which  to  base  assessment  of 
the  mixed  initiative  team  performance.  Such  measures  must 
be  unique  to  the  information  and  decision  environment 
associated with human-robotic teams and must directly link
together  behavioral  processes  important  to  mixed 
manned/unmanned  tactical  outcomes.  The  measures  need  to 
provide  feedback  for  skill  improvement  in  collaboration  as 
well  as  adaptation  to  stress  and  workload,  and  they  should 
help  define  the  training  needs  themselves. 

2.  PERFORMANCE  MEASURES 

Our  work  on  the  definition  of  relevant  performance  measures 
began  with  the  realization  that  future  unmanned  platforms  will 
have  some  capability  to  operate  autonomously  within  the 
scope  of  their  mission  tasking,  but  will  be  continuously 
“commanded”  by  human  operators  who  will  each  direct  the 
activities  of  a  number  of  robots.  As  more  is  learned  about 
modeling  human  behavior,  increased  sophistication  in 
autonomous  operations  by  robotic  systems  can  be  expected  to 
reduce  dependence  on  human  supervisory  controllers.  At 
today’s and the near-future state of understanding, however,
certain  functions  are  not  well  supported  by  automation  and  can 
be  performed  at  a  much  higher  level  of  competence  by  human 
beings  in  collaboration  with  the  robotic  entities. 

Accordingly,  at  the  performance  level  there  are  new  human 
factors  issues  that  require  new  types  of  skills  and  training. 
These emerge from the nature of the robots as decision-making
systems operating in uncertain, unpredictable and
unstructured  environments.  The  key  new  performance  issues 
include: 

•  Performing  supervisory  control  of  robots 

• Adapting to variations in the level of autonomy the robots
exhibit in response to environmental and task variables

• Varying task allocation to exploit the distinct
advantages of the human and robotic components (e.g., a
robot can endure long mission durations and survive better,
but may have only about 80% of human cognitive
capabilities)

• Monitoring robots’ decisions and actions to maintain
transparency of robot actions

•  Overriding  robot  decisions  and  actions  when  necessary 

•  Helping  to  solve  problems  and  handle  contingencies 

Research performed to date on measurement of team
training performance has focused on both the individual and
team levels [5] [6]. It is recognized that while both process
and outcome measures are essential, training feedback mainly
comes from process measures. The guiding principles are: (1)
measurement and remediation must emphasize processes that
are linked to outcomes; and (2) individual-level and team-level
deficiencies must be distinguished to support the instructional
process. In our view these principles are directly applicable to
the manned/unmanned team with the addition of another level
in the team structure, which we term the Collective
Manned/Unmanned (CMU) level, and which represents the
major new dimension that is added to the team task
characteristics and structure. Our selection of measuring
instruments and specification of associated measurement
methodologies thus extends the individual-team matrix of
Cannon-Bowers [1] to include the present case of
collaborative manned/unmanned teams.


3.  PERFORMANCE  MODEL 

The  basis  of  our  MITPAS  approach  has  been  to  develop  a 
Manned/Unmanned  Team  Multi-Dimensional  Performance 
Model  that  captures  the  critical  performance  attributes  of  the 
distinct  human  and  robotic  decision  and  control  environment. 
Figure  1  below  provides  an  overview  of  the  hierarchical 
structure  of  the  Model’s  performance  dimensions. 

The  Performance  Model  we  are  developing  draws  on  four 
separate  research  areas  that  have  been  pursued  independently 
in  the  past  but  which  are  being  integrated  in  this  project  to 
establish  meaningful  criteria  of  overall  performance.  These 
research  areas  are: 

• Psychology of Team Performance - Human team
performance measurement in C3 information environments,
performance variables, training evaluation and
measuring team-related expertise, management of
workload and stress.

•  Unmanned  Systems  -  Principles  of  establishing 
performance  metrics  for  autonomous  systems 

•  Mixed  Initiative  Systems  -  Research  and  findings  on 
the  critical  variables  which  affect  human  decision  and 
control  of  autonomous  systems 

• War Fighting Behavior - Observations and measurements
of combat team performance in war fighting
C3 tasks

Figure 1 System Performance Model

We have integrated and adapted theories and concepts in these
areas to processes associated with manned/unmanned team
performance and training. Most critical were variables
related to the decision-making behavior of the unmanned
systems, such as behavior transparency to the human
collaborators, human trust in robot decisions and human
abilities to synergize the autonomy of robots so as to add to
the capability of the total team. Issues such as behavior
prediction, level of autonomy and acceptance of robot actions
have also been examined and identified as possible
high-impact variables for total system performance.

In  accord  with  this  approach,  we  have  created  a  preliminary 
System  Performance  Model  which  captures  the  critical 
performance  attributes  of  the  distinct  process  of  behavior 
composition  environment.  Our  objective  was  to  identify  the 
dimensions  of  performance  which  contribute  to  effective 
outcomes  of  collaborative  manned-unmanned  tasks  and,  in 
particular,  to  formulate  measures  to  evaluate  training  in 
processes  that  are  unique  to  the  collective  team  of  humans  and 
robots.  Accordingly,  we  have  built  a  taxonomy  of  specific 
processes  which  can  be  decomposed  into  explicit  behavioral 
objectives  side-by-side  with  measures  of  effectiveness  based 
on  actual  outcomes.  Our  focus  is  on  process  measures  that  are 
closely  linked  to  outcomes,  because  it  is  these  measures  that 
will  provide  the  feedback  necessary  for  training.  The  three 
levels  of  team  processes  critical  to  training  evaluation  and 
remediation  are:  (1)  individual  human;  (2)  team  human;  and 
(3)  collective  human/robot  team. 

We decomposed the processes into these three levels and
developed a taxonomy of measures for each level. We narrowed
the performance measures to the simplest factor structure that
adequately covers the dimensions of teamwork, as found by
previous investigators [2]. The actual Performance Model will
consist of a multi-dimensional task process performance
schema which will (1) aggregate the performance measures at
each level, (2) provide for training feedback at each level, and
(3) provide a multi-attribute discriminant function to
determine an overall level of proficiency as well as a “pass-fail”
score. The weights of the attributes will be established in
simulations in which the linkage between specific task
performance measures and outcomes can be estimated. There
are two main types of measures: Measures of Performance
(MOP) and Measures of Effectiveness (MOE); these are
defined separately below.

1. Measures of Performance (MOP)

These are observable and derived measures of the operators’
task skills, strategies, steps or procedures used to accomplish
the task. They consist of the cognitive and interactive
processes of the individual and team in collaborating
and controlling the robotic entities in a coordinated manner.
MOPs evaluate the human factors involved in a complex
system. MOPs are divided into three distinct classes of process
dimensions:

•  Human  Team  Processes  -  These  processes  represent  the 
dimensions  of  the  human  team  interaction 


•  UV  Management  and  Control  Processes  -  These 
processes  represent  the  tasks  associated  with  real  time 
control  and  monitoring  of  the  autonomous  entities 

•  Human/Robot  Team  Processes  -  These  processes 
represent  the  dimensions  of  the  human  interaction  with 
the  robotic  elements 

2.  Measures  of  Effectiveness  (MOE) 

These measure the “goodness” of the composed behavior, in
terms of its quality and the execution of war-fighting tasks.
MOEs are influenced by much more than human performance.
These measures also contain variance accounted for by system
design, the surrounding environment and luck [6]. MOEs
consist of the following dimensions:

•  Mission  Effectiveness  -  Observable  measures  of  the 
success  of  the  mission  as  determined  by  objective  military 
criteria. 

•  Behavioral  Effectiveness  -  Measures  of  the  dimension  of 
behavioral  effectiveness  of  the  system  in  the  battlefield 

We anticipate that only a relevant and/or application-specific
subset  of  all  possible  performance  measures  will  be  used  in 
the  turnkey  MITPAS  because:  (1)  some  of  the  measures  may 
be  correlated;  and  (2)  the  selected  ones  will  require  assurance 
of  high  diagnostic  value,  which  is  referred  to  as  discrimination 
validity,  in  the  particular  situation.  In  our  future  laboratory 
tests  we  plan  to  reduce  the  possible  set  of  measures  to  a 
manageable  subset. 
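
As a minimal sketch of how correlated measures might be screened out before selecting the subset, the following Python fragment keeps a measure only if it is not strongly correlated with one already kept. The measure names, the trial values and the 0.9 threshold are hypothetical illustrations, not MITPAS data or the project's actual selection procedure.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant measure carries no correlation information
    return cov / sqrt(var_a * var_b)

def prune_correlated(measures, threshold=0.9):
    """Keep measures in insertion order, dropping any measure whose absolute
    correlation with an already-kept measure exceeds the threshold."""
    kept = []
    for name, values in measures.items():
        if all(abs(pearson(values, measures[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical data: one value per exercise run for each candidate measure.
trials = {
    "override_latency_s": [4.0, 6.0, 5.0, 9.0, 7.0],
    "override_count": [8.0, 12.0, 10.0, 18.0, 14.0],  # moves in lockstep with latency
    "comm_messages": [30.0, 22.0, 41.0, 25.0, 33.0],
}
selected = prune_correlated(trials)
print(selected)
```

The result depends on the order in which measures are listed, so in practice the measures judged to have the highest diagnostic value would be listed first.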


4.  MITPAS  FUNCTIONAL  REQUIREMENTS 

Our plan is to implement MITPAS as a turnkey software
package incorporating four major capabilities:

• Tools to identify and specify key events that must be
included  in  an  exercise  in  order  to  stimulate  execution  of 
actions  by  participants  that  are  the  targets  of  performance 
measurements; 

•  Tools  to  capture  data  during  the  conduct  of  the  exercise, 
including  automated  extraction  from  data  loggers  and 
formats  for  observational  inputs  from  observers  and 
controllers; 

•  Analytical  tools  to  combine  the  data  collected  and 
produce  quantitative  measures  of  the  performance  and 
effectiveness  of  the  human-robotic  team(s)  being  studied; 

•  Report  generation  tools  to  allow  researchers  and  trainers 
to  produce  diagnostic  and  prescriptive  arrays  of  the 
analytic  products. 

We will also build initial tactical and technical databases,
using proposed FCS tables of organization and equipment and
similar documents from other UV programs.


Figure  2  below  diagrams  the  MITPAS  system  and  its  place 
within  the  training  and  evaluation  environments.  The  Figure 
focuses  on  MITPAS  as  an  adjunct  to  the  existing  distributed 
interactive  training  environment,  specifically  the  OneSAF 
Test  Bed  (OTB),  in  which  it  will  be  developed  and  initially 
evaluated.  Figure  2  also  expands  on  the  normal  context 
diagram  conventions  to  include  the  internal  components  of  the 
system  as  well,  highlighting  which  components  interact  with 
which  outside  entities. 


Figure 2 MITPAS Components and Context

In its initial implementation the system will also serve as the
environment in which candidate measures and metrics are
tested against actual exercise performance in experiments to
identify and validate those measures that are most correlated
with and predictive of successful tactical performance and
battle outcome. We will define the high-level system
functions in terms of Use Case Scenarios and Interaction
Diagrams for the various types of users as well as for
interactions between MITPAS and external systems, such as:

• Military Instructors and systems performance evaluators

• Unit commanders who assign and monitor mission status

• System Designers and Planners


5. MITPAS ARCHITECTURE

In our planned future efforts we will complete and implement
the MITPAS software architecture, developing the interfaces
with external systems and user interfaces to support
identification of scenario requirements, selection of measures,
monitoring and data collection, and post-exercise review and
analysis. We will also develop the analytical engine within
the software, and as the performance measurement algorithms
are developed they will be embedded in that component. The
development of components will be done iteratively, in a
spiral development process, providing an early initial
capability for experimentation, and evolving as experiments
yield more data about performance and system requirements.
In brief, we will implement a MITPAS Prototype System that
will:

• Provide a Core Infrastructure for measuring the
performance of Mixed-Initiative exercises. The core
infrastructure is designed to facilitate the rapid
implementation of performance measurement and
analysis algorithms as well as to enable integration
with multiple heterogeneous simulation and test
environments.

•  Implement  the  specific  performance  measurement 
and  analysis  detailed  for  the  scenario  described  in 
this  proposal  using  the  Core  Infrastructure 

The system will be designed to be scalable and to provide
extensive integration capabilities to meet evolving performance
assessment requirements over the system life-cycle. Critical to
achieving these goals is the use of a modular component-based
software architecture which extensively leverages open
standards and de facto standard best practices in distributed
system development.

Furthermore,  the  system  will  leverage  established  tools  and 
components  which  have  emerged  from  prior  DoD  investment 
in  modeling  and  simulation  as  well  as  independently 
developed  tools  for  collecting  and  analyzing  data  for  DIS  and 
HLA.  Additional  consideration  will  be  given  to  developing 
and  emerging  standards  in  the  training  and  simulation 
communities.  In  particular,  the  MITPAS  Core  Infrastructure 
will  be  designed  to  support  the  Test  and  Training  Enabling 
Architecture  (TENA)  under  development  for  PEO  STRI  as  a 
product  of  the  Foundation  Initiative  2010.  TENA  provides 
significant  improvements  on  HLA  and  is  designed  to  be  used 
with  embedded  training  systems  and  in  training  ranges. 


Figure  3  MITPAS  Architecture 


Figure  3  shows  the  main  MITPAS  system  architecture.  The 
system comprises the following core components:

•  MITPAS  Instructor  Console  -  An  application  to  set 
parameters  for  a  given  Mixed-Initiative  exercise  as 
well  as  construct  a  scenario 

• MITPAS Instrumentation Run-Time API - A
middleware toolkit with APIs in C and Java to enable
rapid instrumentation of entities including C4I
systems, simulation systems, and embedded training
systems

•  HLA  Data  Logger  Interface  -  A  connection  to  an 
existing  data  capture  mechanism  for  capturing  and 
managing  data  from  an  HLA  data-stream 

•  MITPAS  AAR  Interface  -  An  application  which 
implements  the  analysis  and  reporting  capabilities  of 
the  system  as  well  as  invocation  of  Scenario  playback 

• MITPAS Augmented HLA FOM - To support capture
of additional data such as human interaction events,
MITPAS will require augmenting a particular
HLA Federation Object Model with additional
classes and interactions.


6.  Use  Case  Scenario 

We  will  use  scenario-based  training  trials  as  the  experimental 
paradigm  to  identify,  refine  and  validate  MITPAS  measures. 
Scenario-based  training  relies  on  controlled  exercises,  or 
vignettes,  in  which  the  target  training  audience  is  presented 
with  cues  that  are  similar  to  those  found  in  the  actual  task 
environment  and  then  given  performance  feedback.  In  mature 
training environments such scenarios are developed using
training and doctrinal materials such as ARTEPs and Mission
Training Plans along with validated performance measures. In
the MITPAS project, however, the goal is to identify and
validate measures for a type of unit that does not yet exist and
for which no training documents have been developed.
Accordingly,  we  have  developed  a  baseline  scenario  based  on: 

•  Examination  of  candidate  performance  measures 

•  Study  of  the  Future  Combat  System  2015  Unit  of 
Action  Design 

•  Sponsor  focus  on  countermine  capabilities 

Our  current  MITPAS  use-case  scenario  focuses  on  a  platoon 
of  a  Reconnaissance  Troop,  reinforced  with  Engineers,  which 
is  escorting  a  convoy  in  an  Iraq-like  environment.  The  platoon 
employs UGVs, SUGVs, UAVs (Type 3), MULEs and an
ACRV,  which  allows  for  representation  of  a  wide  range  of 
robotic  capabilities  and  supports  experiments  focusing  on 
soldiers  controlling  individual  robots,  on  those  controlling 
multiple  homogeneous  or  heterogeneous  robots,  or  on  a  leader 
controlling  mixed  human  and  robotic  elements.  Our  current 
scenario  requires  subjects  to  deal  with  an  improvised 
explosive  device,  a  traditional  minefield,  small  unit  enemy 
action,  casualties,  and  maintaining  communications. 

We are able to identify a set of critical control events within
the MITPAS scenario that exemplify the type of mixed
initiative performance we are trying to assess. In the future
these critical events will be further refined in cooperation with
our RDECOM PMs. In addition, the final scenario events and
candidate performance measures will then be mapped to each
other to ensure that scenario execution will elicit the actions
that the measures require. Table 1 below shows an initial stage
in the process, in which measures are mapped into scenario
events based on the current findings. The purpose here is to
demonstrate the methodological approach, rather than provide
an exhaustive listing, which will form part of the planned
future effort.

Table  1  Scenario  Events  vs.  Performance  Measures 
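
The mapping of measures to scenario events can be checked mechanically for gaps. The sketch below is illustrative only: the event labels echo the scenario elements described above, but the measure names and the mapping itself are hypothetical and are not the contents of Table 1.

```python
def uncovered_measures(event_to_measures, candidate_measures):
    """Return the candidate measures that no scenario event elicits, sorted by name."""
    covered = set()
    for measures in event_to_measures.values():
        covered.update(measures)
    return sorted(set(candidate_measures) - covered)

# Hypothetical mapping of scenario events to the measures they should elicit.
scenario_map = {
    "IED detected on route": {"robot tasking time", "trust in robot report"},
    "minefield encountered": {"task allocation", "robot tasking time"},
    "enemy contact": {"override of robot action", "team communication"},
}
candidates = [
    "robot tasking time", "trust in robot report", "task allocation",
    "override of robot action", "team communication", "workload management",
]
gaps = uncovered_measures(scenario_map, candidates)
print(gaps)  # measures that would need an added or revised scenario event
```

Any measure reported as uncovered indicates that the scenario, as scripted, would not elicit the behavior that the measure requires.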


7. Criteria for Success

Our approach to establishing criteria of success will follow the
concepts of the Army Training and Evaluation Program
(ARTEP), which is the cornerstone program of unit training.

Each ARTEP consists of defined tactical tasks to be
performed under specified conditions to a criterion or
standard. To determine if the standard is reached, the ARTEP
provides evaluators with a list of Task Steps and Performance
Measures scored Go, No Go or Not Evaluated. The ratio of
subjective Go to No Go marks and the significance of each
determine whether the performance standard has been met.
While the ARTEPs have evolved over decades to capture
virtually all relevant measures of performance with regard to
human collectives, collectives of humans and robots will
demand the exercise of additional skills by the human
elements. The robots’ decisions will not always be transparent
to the humans. Human acceptance of these decisions will
depend on understanding the robots’ capabilities and
anticipating robot behavior. The approach was proposed in our
Phase I proposal and further analysis validated its
applicability and effectiveness.

We will aggregate the individual performance measures into a
scoring criterion by starting with selected ARTEPs that can be
adapted to human-robotic collectives (using FCS training
studies as a guide) and adding additional measures such as the
ones discussed above. The single-score-for-a-single-task
methodology of ARTEP will be expanded to provide a single
score for a collective pattern of tasks. We propose a
multi-dimensional criterion of performance success, P, that
combines the direct performance measures across the various
experimental (robot system) variables, as described below:

1. Let $x_j$ be the pattern of performance measures

$x_j = (x_{1j}, x_{2j}, x_{3j}, \ldots, x_{nj})$ under the various conditions,
i.e. level of automation, stress, etc., marked by the subscript $j$.

The multi-attribute performance score for condition $j$ is:
$g(x_j) = w_1 x_{1j} + w_2 x_{2j} + w_3 x_{3j} + \ldots + w_n x_{nj}$,

thus

$g(x_j) = \sum_{i=1}^{n} w_i x_{ij}$

2. To get a total score across all conditions the combined
score is

$\sum_{j=1}^{m} g(x_j)$

3. The combined aggregated score for all performance
measures and conditions will then be:

Total Score $P = \sum_{j=1}^{m} \sum_{i=1}^{n} w_i x_{ij}$

4. The weights $(w_1, w_2, \ldots, w_n)$ will be determined by
testing experiments and expert judgment using a
parameter estimation protocol of the type used in
trainable pattern recognizers.
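
The weighted-sum criterion in steps 1 to 3 can be illustrated with a short Python sketch. The weights and measure values below are hypothetical placeholders; in the planned work the weights would come from the estimation protocol of step 4, which is not shown here.

```python
def condition_score(weights, measures):
    """g(x_j): weighted sum of the n performance measures for one condition j."""
    if len(weights) != len(measures):
        raise ValueError("need exactly one weight per performance measure")
    return sum(w * x for w, x in zip(weights, measures))

def total_score(weights, conditions):
    """P: sum of g(x_j) over all m conditions."""
    return sum(condition_score(weights, x_j) for x_j in conditions)

# Hypothetical example: n = 3 measures observed under m = 2 conditions.
weights = [0.5, 0.25, 0.25]
conditions = [
    [0.8, 0.6, 1.0],  # condition j = 1, e.g. a low level of automation
    [0.4, 0.8, 0.6],  # condition j = 2, e.g. a high level of automation
]
per_condition = [condition_score(weights, x_j) for x_j in conditions]
P = total_score(weights, conditions)
print(per_condition, P)
```

With these illustrative numbers the two condition scores are about 0.8 and 0.55, giving a total score P of about 1.35. The same weights are applied under every condition, as in the formula for P.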

We  have  developed  a  schema  employing  a  factor  analytic 
approach  to  reducing  and  refining  the  set  of  measures  to 
reflect  underlying  orthogonal  performance  dimensions  [7]. 
This  strategy  will  be  employed  using  a  virtual  battlespace  to 
collect  data  for  analysis. 

The  scenarios,  candidate  measures  and  algorithms,  and  the 
OTB V2.0 virtual testbed provide a framework for a multi-stage
data collection effort within which soldiers with
representative  background,  experience,  training,  and  skill 
levels  will  be  asked  to  execute  FCS  missions  as  part  of  a 
human-MULE  robot  team.  After  a  verification  and  validation 
effort  to  ensure  that  the  test  software  produces  the  intended 
data  products,  mission  trials  will  be  conducted  in  which 
soldiers  will  team  with  robots  to  perform  specific  assignments 
within  the  exercise  scenarios.  The  simulation,  instrumented 
with  the  selected  data  extraction  and  analysis  tool,  will 
produce  measure  data  for  each  of  the  candidate  measures 
constituting  the  independent  variables. 

Dependent  variable  data  will  come  from  a  different  source. 
The  Objective  Force  combat  development  community  will  be 
asked  to  provide  subject  matter  experts  to  observe  the  trials 
and  to  provide  subjective  evaluations  of  the  execution  of  the 
human-robot  team.  Accepting  the  expert  judgment  to  be  the 
reference  standard  for  performance  evaluation,  the  factor 
analysis  process  will  be  employed  to  examine  the  value  of  the 
component  and  composite  linear  factor  combinations  of  the 
candidate  measures  in  accounting  for  observed  performance. 
The  intent  is  to  seek  to  identify  a  reduced  set  of  orthogonal 
underlying  composite  measures  to  which  a  practically 
substantial  proportion  of  the  measure  variance  (in  relation  to 
expert subject judgment) can be allocated. Conceptually, the
process can be thought of as a rotation of the principal
variable axes within the data space to identify a new
coordinate set in which a small number of axes account for most
of the data variance. The rotated axes are linear combinations
of the original set, and
correspond to underlying variable factors suggested by the
distribution of the data in the variable space. Factor analysts
often look upon this as a “first-stage solution” and will typically
follow this with further non-orthogonal rotations to achieve
what they call a “simple structure”. For our purposes,
however, this will not be advisable, as non-orthogonal rotation
has implications for the independence, transformation, and
scaling of the data.
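
The idea of orthogonal axes that are linear combinations of the original measures can be illustrated with a small principal-axis computation. This is a simplified stand-in (principal components by power iteration with deflation), not the project's factor-analytic procedure, and the trial data are hypothetical. Power iteration can fail if the starting vector happens to be orthogonal to the leading axis; a production analysis would use an established statistics package.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance matrix; one row per trial, one column per measure."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def mat_vec(m, v):
    """Product of a square matrix and a vector."""
    return [sum(m[a][b] * v[b] for b in range(len(v))) for a in range(len(m))]

def leading_axis(cov, iterations=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    v = [1.0 / (k + 1) for k in range(len(cov))]  # deliberately asymmetric start
    norm = sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return eigenvalue, v

def principal_axes(rows, k):
    """First k orthogonal principal axes as (variance, direction) pairs."""
    cov = covariance_matrix(rows)
    d = len(cov)
    axes = []
    for _ in range(k):
        lam, v = leading_axis(cov)
        axes.append((lam, v))
        # Deflation: remove the variance already captured along v.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return axes

# Hypothetical trials with two strongly related candidate measures.
rows = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8], [10.0, 10.0]]
axes = principal_axes(rows, 2)
total_variance = sum(lam for lam, _ in axes)
explained = [lam / total_variance for lam, _ in axes]
print(explained)
```

In this example nearly all of the variance falls on the first axis, which is the situation in which two candidate measures could be replaced by a single composite measure.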


8.  Experimental  Plan 

Our  planned  experimental  test  program  is  structured  in  four 
parts.  Following  is  a  preliminary  description  of  each  phase; 
the  detailed  test  design  will  be  produced  during  the 
requirements  development  effort. 

1. Laboratory System Pilot Runs

In  the  first  phase,  the  test  environment  will  be  set  up  and 
validated.  Pilot  runs  will  confirm  that  the  measurement 
algorithms  are  functioning  correctly,  that  the  scenario  is 
properly simulated, that the participating virtual platforms and
behavior representations are valid, and that the human
operator  interface  is  fully  functional.  Pilot  runs  will  be 
conducted  to  confirm  that  the  design  is  fully  responsive  to  the 
requirements  of  the  program. 

2.  Model  Validation  and  Tuning 

The  second  phase  will  be  devoted  to  collecting  data  across  the 
spectrum  of  operations  in  the  scenario,  expert  observation  and 
evaluation,  and  reduction  of  the  measure  set  through  factor 
analysis.  The  focus  will  be  on  the  simplest  form  of  human- 
robot  team,  a  single  operator  supervising  the  activities  of  one 
or  two  robots.  The  scenario  will  be  executed  in  the  context  of 
FCS  embedded  individual  training  with  an  emphasis  on  what 
might  become  ARTEP/Drill  tasks  for  the  human-robot  team. 

3.  Battle  Operations  in  Simulation 

We  will  validate  the  reduced  measure  set  by  applying  it  to  a 
more  complex  set  of  activities  representative  of  FCS 
battlespace  operations.  The  scenario  will  involve  sequences 
of  the  types  of  tasks  that  formed  the  focus  for  phase  two,  and 
it  will  be  executed  by  a  small  team  consisting  of  two  or  more 
human  operators  and  several  virtual  robots.  This  will 
introduce  the  dimension  of  collaboration  and  allocation  of 
responsibilities  to  the  scenario  execution. 

4.  Field  Operation  with  Live  UVs 

As an option, we propose in a fourth phase to demonstrate the
operation of the performance measurement system in a live
simulated environment using instrumented UVs operating on a
tactical range.


9.  Conclusions 

The  key  challenge  being  addressed  in  this  project  is  the  fact 
that  autonomous  vehicles,  or  agents,  will  need  to  interact  and 
coordinate  with  each  other  and  with  human  systems. 
Measurement  of  overall  effectiveness  in  these  mixed  initiative 
systems  will  be  essential  in  order  to  achieve  optimal  system 
performance  levels.  Behavioral  measures  of  both  human  and 
unmanned  combat  system  (UCS)  performance  obtained  in 
system simulations or in live exercises will be used to
continuously diagnose performance and identify areas
where training is required [3].

Likewise,  specialized  training  will  be  necessary  in  order  to 
leverage  the  complementary  cognitive  functions  of  human  and 
machine  to  forge  fighting  entities  and  units  with  capabilities 
superior  to  those  of  humans  or  machines  in  isolation. 
Embedded  training  is  also  projected  to  be  an  important  part  of 
the  Future  Combat  System  (FCS)  to  assure  that  performance 
levels remain high during all operational phases. Overall, a
clear and definite need exists for methods and mechanisms to
assess and determine criteria for successful performance of
unmanned systems and manned/unmanned teams in both
training environments and real warfighting situations.

We  believe  that  meeting  this  need  will  also  lead  to  significant 
commercial  product  opportunities  in  the  large  and  rapidly 
expanding  military  and  non-military  markets  for  robotic 
systems.  The  focus  of  our  SBIR  commercialization  strategy 
will  be  transformation  of  the  MITPAS  prototype  into  a  suite  of 
software  modules  for  use  in  a  variety  of  mixed  initiative  and 
mobile  agent  applications.  The  software  product  will  be 
optimized  to  meet  military  and  non-military  market 
requirements.  It  will  be  sold  and/or  licensed  to  DoD  and 
Homeland  Defense  agencies  and  prime  contractors,  to  civil 
organizations  that  employ  remote  human  controlled  robotic 
agents  and  unmanned  vehicles  in  hostile  environments  and  for 
counter  terrorism  activities  and  local  law  enforcement,  and 
also  to  companies  manufacturing  and  distributing  industrial 
and  personal  robots.  In  addition,  we  plan  to  explore  in  Phase 
the  application  of  the  MITPAS  as  a  commercial  tool  for 
helping  military  and  non-military  emergency  response  teams 
determine  when  and  how  to  use  mixed  initiative  teams  on  a 
particular  type  of  mission,  e.g.,  in  a  bomb  disposal  situation. 